Abstract: We consider the scenario of broadcasting for real-time
applications and loss recovery via instantly decodable
network coding. Past work focused on minimizing the completion 
delay, which is not the right objective for real-time applications 
that have strict deadlines. In this work, we are interested in 
finding a code that is instantly decodable by the maximum 
number of users. First, we prove that this problem is NP-Hard 
in the general case. Then we consider the practical probabilistic
scenario, where users have i.i.d. loss probabilities and the
number of packets is linear or polynomial in the number of 
users, and we provide a polynomial-time (in the number of users) 
algorithm that finds the optimal coded packet. Simulation results 
show that the proposed coding scheme significantly outperforms 
an optimal repetition code and a COPE-like greedy scheme. 

I. Introduction 

Broadcasting data to multiple users is widely used in several 
wireless applications, ranging from satellite communications 
to WiFi networks. Wireless transmissions are subject to packet 
losses due to channel impairments, such as wireless fading 
and interference. Previous work has shown that coding can 
improve transmission efficiency, throughput, and delay over 
broadcast erasure channels [5]-[7], [9], [34], [35]. Intuitively,
the diversity of lost packets across different users creates 
coding opportunities that can improve various performance 
metrics. 

In this work, we are interested in packet recovery for real-time
applications, such as fast-paced multi-player games and
live video streaming. Real-time applications have two distinct
characteristics: (i) they have strict and urgent deadlines, i.e., a
packet is outdated after a short amount of time, and (ii) they
can tolerate some losses, e.g., a client can restore the game
state by resyncing periodically [2]. Despite this fault tolerance,
these applications can significantly suffer from packet losses 
that often lead to jittery game animation and low quality video 
playback. Hence, it is highly desirable to recover packet losses 
with very low delay and within a very narrow coding window. 
Motivated by the above observations, we focus on coding
schemes for loss recovery that allow instantaneous decoding,
i.e., zero delay. These coding schemes are also known as 
Instantly Decodable Network Codes (IDNC). 

Previous work on IDNC [6], [7], [8]-[12] focused on
minimizing the completion delay, i.e., the time it takes to
recover all the losses at all users. We formulate a different
problem that is more relevant to real-time applications, called
Real-Time IDNC: Consider a source that broadcasts a set of
packets, $X$, to a set of users. Each user $u$ wants all packets
in $X$ and already knows a subset of them, $H_u \subseteq X$, for
example through previous transmissions. The goal is to choose
one (potentially coded) packet to broadcast from the source,
so as to maximize the number of users who can immediately
recover one lost packet. This problem is highly relevant in
practice yet, to the best of our knowledge, has only been solved
heuristically so far, e.g., see [14], [16]. Our main contributions
are the following:

• We show that Real-Time IDNC is NP-hard. To do so, 
we first map Real-Time IDNC to the Maximum Clique 
problem in an IDNC graph (to be precisely defined in 
Section III). We then show that the Maximum Clique 
problem is equivalent to an Integer Quadratic Program 
(IQP) formulation. Finally, we provide a reduction from 
a well-known NP-Hard problem (the Exact Cover by 3- 
Sets) to this IQP problem. 

• We analyze random instances of the problem, where each 
packet is successfully received by each user randomly and 
independently with the same probability. This problem, 
referred to as Random Real-Time IDNC, corresponds to 
a Maximum Clique problem on an appropriately created 
random IDNC graph. Surprisingly, we show that when the 
number of packets is linear or polynomial in the number 
of users, the Maximum Clique problem can be solved with 
high probability on this particular family of random graphs, 
by a polynomial-time (in the number of users) algorithm, 
which we refer to as the Max Clique algorithm. 

We implement and compare our coding scheme, Max
Clique, against two baselines: an optimal repetition code and a
COPE-like greedy scheme proposed in [6]. Simulations show
that our scheme significantly outperforms these state-of-the-art
schemes over a range of scenarios, for the loss probability
varying from 0.01 to 0.99. For example, for 20 users and 20
packets, our scheme improves by a factor of 1.3 on average,
and performs up to 1.6 times better than the COPE-like scheme
and up to 3.8 times better than the optimal repetition code.

The remainder of this paper is organized as follows. Section
II discusses related work. Section III formulates the problem.
Section IV describes the maximum clique and integer program
formulations as well as the proof of NP-completeness. Section
V analyzes the probabilistic version (Random Real-Time
IDNC problem) and provides the polynomial-time algorithm
to find a maximum clique w.h.p. Section VI evaluates and
compares our coding scheme with existing schemes. Section
VII concludes the paper.



II. Related Work 

Instantly Decodable Network Coding. Katti et al. p) pro- 
posed COPE, an opportunistic inter-session network coding 
scheme for wireless networks. Encoded packets are chosen 
so that they are immediately decodable at the next hop. The 
algorithm considers combining packets in a FIFO way (as 
stored in the transmitting queue) and greedily maximizes the 
number of receivers that can decode in the next time slot. 
Keller et al. [6] investigate algorithms that minimize decoding
delay, including two algorithms that allow for instantaneous
decoding: a COPE-like greedy algorithm and a simple repetition
algorithm. In Section VI, we use these two algorithms as
baselines for comparison.

Sadeghi et al. [7] improved the opportunistic algorithm in
[6] by giving high priority to packets that are needed by a
large number of users. The authors also gave an Integer Linear 
Program formulation to the problem of finding an instantly 
decodable packet that maximizes the number of beneficiary 
users. Furthermore, they show that it is NP-hard based on the 
Set Packing problem. We note that their formulation differs 
from ours since they require that a coded packet must be 
instantly decodable by all users, where there may be some 
users that do not benefit from the packet. This may lead to 
a suboptimal solution, i.e., there may be a coded packet that 
is only instantly decodable by some but not all users but is 
beneficial to a larger number of users. Our formulation ensures 
that we find this optimal packet. 

Sorour et al. have an extensive line of work investigating
instantly decodable codes [8]-[13], focusing on minimizing
the completion delay. They introduced the term Instantly
Decodable Network Coding (IDNC) that we adopt in this
work. They propose a construction of IDNC graphs based
on feedback from the users in [8] and then introduce a
transmission scheme based on graph partitioning. We consider
the same construction of IDNC graphs as in [8]. Based on a
stochastic shortest path formulation, the authors in [9] propose
a heuristic algorithm to minimize the completion delay. In
[10], they introduced the notion of the generalized IDNC problem,
which does not require the transmitted code to be decodable
by all users, as in the strict version studied previously [6],
[8], [9]. Real-Time IDNC considers the generalized version.
Furthermore, in [10], they relate finding an optimal IDNC
code to the maximum clique problem in IDNC graphs and
suggest that it is NP-Hard; however, no explicit reduction
was provided. In [11] and [12], the authors extend [9] to cope
with limited or lossy feedback and, in [13], to use multicast
instead of broadcast.

Li et al. [14] use IDNC for video streaming and show that,
for independent channels and a sufficiently large video file, their
schemes are asymptotically throughput-optimal subject to hard
deadline constraints when there are no more than three users.
In contrast, we consider an arbitrary number of users and we
provide the optimal single transmission.

Index Coding. Our problem setup is the same as in the 
Index Coding (IC) problem, introduced by Birk and Kol (JT3J 
and extensively studied since. IC considers a base station that
knows a set of packets X and a set of users. Each user 
$(x, H)$ demands one particular packet $x \in X$ and has side
information consisting of a subset of packets $H \subseteq X$. The
base station broadcasts to all users without errors. The goal 
is to find an encoding scheme that minimizes the number of 
transmissions required to deliver the packets to all users. There 
are two differences between our problem and IC. First, in our 
problem each user wants all the packets, not just a single
packet. Second, we want to find an instantly decodable packet
that maximizes the number of users that benefit in a single 
transmission, not the total number of transmissions to satisfy 
all users. 



Data Exchange was introduced by El Rouayheb et al. [22]
and has a similar setup: there is a set of packets $X$ and a set
of users. Each user $u$ knows a subset of packets, $H_u \subseteq X$,
and wants all packets in $X$. Unlike IC, there is no base
station and the users broadcast messages. Also unlike IC,
the objective is to find an encoding scheme that minimizes
the number of transmissions required to deliver all packets in 
X to all users. Similar to the data exchange problem, in our 
setting, all users want all the packets in X. There are, however, 
two differences: (i) in our setting, only the base station can 
broadcast as opposed to having all users broadcasting, and (ii) 
we are interested in instantaneous decoding to maximize the 
number of beneficiary users for one transmission, as opposed 
to minimizing the total number of transmissions. 

III. Problem Formulation 

Let $\mathcal{U} = \{u_1, \dots, u_n\}$ be the set of $n$ users, and
$\mathcal{P} = \{p_1, \dots, p_m\}$ be the set of $m$ packets. We assume that the
original $m$ packets were broadcast by a base station. Due to
packet loss, each of the $n$ users missed some of the $m$ packets.
We denote the set of packets that were successfully received
by user $i$ by $\mathcal{H}_i$. Furthermore, let $\mathcal{W}_i$ be the set of packets
that user $i$ still wants, i.e., $\mathcal{W}_i = \mathcal{P} \setminus \mathcal{H}_i$. Consistent with [8]
and [16], we call the $\mathcal{H}$'s and $\mathcal{W}$'s the "Has" and "Want" sets.

After the initial broadcast, the base station tries to recover
the losses, the $\mathcal{W}$'s, by sending coded packets and exploiting the
side information of the already delivered packets, the $\mathcal{H}$'s. Let
the $n \times m$ matrix $A$ be the identification matrix for the side
information of the users, i.e., entry $a_{ij} = 1$ if user $u_i$ wants
packet $p_j$ and 0 otherwise. $A$ is also called a feedback matrix,
as in [8]-[13]. Let us clarify this by an example.

Example 1. Consider a scenario with 3 users and 6 packets.
Furthermore, assume that after the initial broadcast, user $u_1$
successfully received packets $p_1$ and $p_2$; user $u_2$ received
$p_3$ and $p_5$; and user $u_3$ received $p_3$ and $p_6$. Thus the side
information matrix is

$$A = \begin{pmatrix} 0 & 0 & 1 & 1 & 1 & 1 \\ 1 & 1 & 0 & 1 & 0 & 1 \\ 1 & 1 & 0 & 1 & 1 & 0 \end{pmatrix}.$$

Fig. 1. The Instantly Decodable Network Coding (IDNC) graph of Example
1. Solid edges are edges of type (i) and dashed edges are edges of type
(ii). There are three maximum cliques: $\{v_{13}, v_{21}, v_{31}\}$, $\{v_{13}, v_{22}, v_{32}\}$, and
$\{v_{14}, v_{24}, v_{34}\}$, all of which are of size 3.

To deliver the packets in the Want sets of the users, we
focus on instantly decodable, lightweight coding schemes that
operate in GF(2). For a set of packets $\mathcal{M}$, the corresponding
coded packet $c$ is $c = \bigoplus_{p_i \in \mathcal{M}} p_i$, where $\oplus$ denotes the binary
sum.

Definition 1. A coded packet, $c^{\mathcal{N}}$, is instantly decodable with
respect to a set of users, $\mathcal{N}$, if and only if

(i) Every user, $u_i \in \mathcal{N}$, can decode $c^{\mathcal{N}}$ immediately upon
reception to recover a packet $p_i \in \mathcal{W}_i$. That is, each user
in $\mathcal{N}$ benefits from $c^{\mathcal{N}}$ by recovering one of the packets
from its Want set.

(ii) Every packet in the binary sum of $c^{\mathcal{N}}$ is wanted by at
least one user in $\mathcal{N}$.

For example, for the scenario of Example 1, the coded
packet $c^{\{u_1,u_2,u_3\}} = p_1 \oplus p_3$ is instantly decodable with respect
to $\{u_1, u_2, u_3\}$ since $u_1$ can recover $p_3$, while $u_2$ and $u_3$
can get $p_1$. Meanwhile, $c^{\{u_2,u_3\}} = p_5 \oplus p_6$ is not instantly
decodable with respect to $u_1$. Furthermore, we do not consider
$c^{\{u_2,u_3\}} = p_3 \oplus p_5 \oplus p_6$ instantly decodable with respect to
$\{u_2, u_3\}$ since, although $c^{\{u_2,u_3\}}$ can be decoded by $u_2$ and
$u_3$, $p_3$, which is a component of $c^{\{u_2,u_3\}}$, is not wanted by
either $u_2$ or $u_3$. From here on, we may omit the superscript
$\mathcal{N}$ of $c^{\mathcal{N}}$ when there is no ambiguity.

We would like the coded packet to be immediately beneficial
to as many users as possible. Thus, our notion of optimality is
with respect to the cardinality of the set of beneficiary users.

The Real-Time IDNC Problem: Given a side information
matrix $A$, find the optimal instantly decodable packet $c^{\mathcal{N}}$, i.e., one that maximizes $|\mathcal{N}|$.
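
As an illustration of Definition 1 and of the Real-Time IDNC objective, the following is a minimal Python sketch, not the algorithm proposed in this paper. The Want sets are our reading of Example 1, and the names `WANTS`, `beneficiaries`, `is_instantly_decodable`, and `best_packet` are hypothetical. The search is exhaustive and only meant for toy sizes.

```python
from itertools import combinations

# Want sets for the three users of Example 1 (1-based packet indices).
# Assumption: u1 has {p1, p2}, u2 has {p3, p5}, u3 has {p3, p6}.
WANTS = {1: {3, 4, 5, 6}, 2: {1, 2, 4, 6}, 3: {1, 2, 4, 5}}


def beneficiaries(packets, wants):
    """Users that miss exactly one packet of the XOR and hold all the others."""
    return {u for u, w in wants.items() if len(w & packets) == 1}


def is_instantly_decodable(packets, users, wants):
    """Check conditions (i) and (ii) of Definition 1 for a packet set and a user set."""
    packets = set(packets)
    cond_i = all(len(wants[u] & packets) == 1 for u in users)
    cond_ii = all(any(p in wants[u] for u in users) for p in packets)
    return cond_i and cond_ii


def best_packet(wants, num_packets):
    """Exhaustive search over all nonempty packet subsets (exponential time)."""
    best = (set(), set())
    for size in range(1, num_packets + 1):
        for combo in combinations(range(1, num_packets + 1), size):
            users = beneficiaries(set(combo), wants)
            if len(users) > len(best[1]):
                best = (set(combo), users)
    return best


packets, users = best_packet(WANTS, 6)
print(packets, users)
```

The exhaustive search is exponential in the number of packets, which is consistent with the hardness result established in Section IV.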

IV. Maximum Cliques in IDNC Graphs 

Given a side information matrix $A$, we form an Instantly
Decodable Network Coding (IDNC) graph corresponding to
$A$ as in [8]: We create a vertex $v_{ij}$ when user $u_i$ still wants
packet $p_j$. For instance, for matrix $A$ in Example 1, there is a
vertex for each entry 1 in the matrix. Given a vertex $v_{ij}$, we
use the term user index of $v_{ij}$ to indicate $i$ and packet index
of $v_{ij}$ to indicate $j$. There is an edge between two vertices
$v_{ij}$ and $v_{k\ell}$ if one of the conditions below holds:

(i) $j = \ell$: In this case, both users $u_i$ and $u_k$ want the same
packet $p_j = p_\ell$.

(ii) $p_j \in \mathcal{H}_k$ and $p_\ell \in \mathcal{H}_i$: In this case, user $u_k$ has packet
$p_j$ that user $u_i$ still wants, and vice versa.

Denote the IDNC graph corresponding to a matrix $A$ by $G^A = (\mathcal{V}, \mathcal{E})$.
Figure 1 shows the IDNC graph corresponding to the
side information matrix given in Example 1.
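
The construction above can be sketched in a few lines of Python. This is a hedged illustration only: the matrix `A` is our reading of Example 1, indices are 0-based, and `idnc_graph` and `maximum_cliques` are hypothetical helper names. The clique search is brute force and is not the Max Clique algorithm of Section V.

```python
from itertools import combinations

# Side information matrix of Example 1 as we read it (rows: users, columns: packets).
A = [
    [0, 0, 1, 1, 1, 1],
    [1, 1, 0, 1, 0, 1],
    [1, 1, 0, 1, 1, 0],
]


def idnc_graph(a):
    """Return (vertices, edges); vertex (i, j) exists when user i wants packet j."""
    vertices = [(i, j) for i, row in enumerate(a) for j, x in enumerate(row) if x == 1]
    edges = set()
    for (i, j), (k, l) in combinations(vertices, 2):
        same_packet = j == l                              # edge of type (i)
        mutual_has = a[k][j] == 0 and a[i][l] == 0        # edge of type (ii)
        if same_packet or mutual_has:
            edges.add(frozenset([(i, j), (k, l)]))
    return vertices, edges


def maximum_cliques(vertices, edges):
    """Brute-force search for all cliques of maximum size (toy instances only)."""
    for size in range(len(vertices), 0, -1):
        found = [
            set(c)
            for c in combinations(vertices, size)
            if all(frozenset(pair) in edges for pair in combinations(c, 2))
        ]
        if found:
            return found
    return []


cliques = maximum_cliques(*idnc_graph(A))
print(cliques)
```

Note that two vertices in the same row are never adjacent: they differ in packet index, and condition (ii) would require the user to already have a packet it wants.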

A. Cliques and Instantly Decodable Packets 

Proposition 1. Finding an optimal instantly decodable code
given a side information matrix $A$ is equivalent to finding a
maximum clique in the corresponding IDNC graph $G^A$.

We prove this proposition by establishing the following
Lemmas 2 and 3. The first lemma states the relationship
between instantly decodable packets and cliques in $G^A$.

Lemma 2. Given a side information matrix $A$ and its IDNC
graph $G^A$, an instantly decodable packet has a one-to-one
correspondence to a clique in $G^A$.

Proof:

($\Leftarrow$) We first show that a clique in $G^A$ maps to an instantly
decodable packet, which is uniquely identified by the user and
packet indices of the vertices in the clique.

Let $C$ be a clique in $G^A$: $C = \{v_{i_1 j_1}, \dots, v_{i_k j_k}\}$. Without
loss of generality, assume $j_1, \dots, j_k$ are pairwise distinct, and
compute $c = p_{j_1} \oplus \dots \oplus p_{j_k}$. (If $j_{t_1} = j_{t_2} = \dots = j_{t_n}$, $n > 1$,
then include only $p_{j_{t_1}}$ in the XOR.) $c$ is an instantly decodable
packet with respect to the set of users $\{u_{i_1}, \dots, u_{i_k}\}$ because

• For any user $u_{i_t}$, for some $t \in [1, k]$, the existence of
vertex $v_{i_t j_t}$ indicates that $u_{i_t}$ wants $p_{j_t}$. In the following,
we show that $u_{i_t}$ can decode $p_{j_t}$ immediately upon
receiving $c$. Without loss of generality, consider user $u_{i_1}$.
It suffices to show that $u_{i_1}$ has all other packets in $c$. To
see this, assume otherwise, i.e., assume $u_{i_1}$ does not have
packet $p_{j_s}$, for some $s \in [2, k]$ where $p_{j_s} \neq p_{j_1}$. Then
there is no edge between $v_{i_1 j_1}$ and $v_{i_s j_s}$, a contradiction.

• Each component, $p_{j_t}$, $t \in [1, k]$, of $c$ is wanted by $u_{i_t}$.

($\Rightarrow$) We now show that an instantly decodable packet maps
to a clique in $G^A$, which is uniquely identified by the packets
involved and the set of beneficiary users. Let $c^{\mathcal{N}} = p_{j_1} \oplus \dots \oplus p_{j_k}$
be an instantly decodable packet with respect to the set of
users $\mathcal{N}$. Let $p_{j_t}$ be wanted by distinct users $\{u^t_1, \dots, u^t_{n_t}\}$,
for some $n_t > 0$. The following set of vertices, $C$, forms a
clique in $G^A$:

$$C = \{ v_{u^1_1 j_1}, \dots, v_{u^1_{n_1} j_1}, \; \dots, \; v_{u^k_1 j_k}, \dots, v_{u^k_{n_k} j_k} \}.$$

We will show that there is an edge between any two vertices
in $C$:

• For any $t \in [1, k]$, consider any pair $a \neq b$, $a, b \in [1, n_t]$.
There is an edge of type (i) between $v_{u^t_a j_t}$ and $v_{u^t_b j_t}$,
since both $u^t_a$ and $u^t_b$ need $p_{j_t}$.

• For any pair of $s \neq t$, $s, t \in [1, k]$, consider $a \in [1, n_s]$ and
$b \in [1, n_t]$. There is an edge of type (ii) between $v_{u^s_a j_s}$
and $v_{u^t_b j_t}$. This is because $u^s_a$ must have $p_{j_t}$ as it can
decode $p_{j_s}$ immediately, and $u^t_b$ must have $p_{j_s}$ as it
can decode $p_{j_t}$ immediately.

Finally, it is easy to check that for the above two mappings,
one is the reverse mapping of the other. ■
For instance, let us consider the clique involving $v_{13}$, $v_{21}$,
and $v_{31}$ in Example 1. XORing all packets corresponding
to vertices of this clique, i.e., $p_1 \oplus p_3$, forms an instantly
decodable packet because (i) user 1 must have $p_1$, and users
2 and 3 must have $p_3$, otherwise there are no edges $(v_{13}, v_{21})$
and $(v_{13}, v_{31})$, and (ii) each component of the coded packet is
wanted by the user corresponding to the row of the vertex. The
next lemma expresses the relationship between the number of
users benefiting from an instantly decodable packet and the
size of the clique corresponding to the packet.

Lemma 3. Given a side information matrix $A$ and its IDNC
graph $G^A$, let $c^{\mathcal{N}}$ be an instantly decodable packet, and let
$C$ be the corresponding clique of $c^{\mathcal{N}}$ in $G^A$; then $|C| = |\mathcal{N}|$.

Proof: Let $c^{\mathcal{N}} = p_{j_1} \oplus \dots \oplus p_{j_k}$ be an instantly decodable
packet w.r.t. the set of users $\mathcal{N}$. Let $p_{j_t}$, $t \in [1, k]$, benefit
distinct users $\{u^t_1, \dots, u^t_{n_t}\}$, for some $n_t > 0$. The following
set of vertices, $C$, forms the clique corresponding to $c^{\mathcal{N}}$:

$$C = \{ v_{u^1_1 j_1}, \dots, v_{u^1_{n_1} j_1}, \; \dots, \; v_{u^k_1 j_k}, \dots, v_{u^k_{n_k} j_k} \}.$$

To show $|\mathcal{N}| = |C|$, it suffices to show that all user indices
of vertices in $C$ are pairwise distinct. For any pair of $s \neq t$,
where $s, t \in [1, k]$, consider any $a \in [1, n_s]$ and any $b \in [1, n_t]$.
$u^s_a \neq u^t_b$ because otherwise $u^s_a$ cannot decode $p_{j_s}$. ■

B. NP-Completeness

Finding a maximum clique in a general graph is well known 
to be NP-Hard. This result, however, is not directly applicable 
to IDNC graphs as they have special structural properties. 
In this section, we will show that the problem of finding a 
maximum clique in an IDNC graph is indeed NP-Hard. We 
show this by first showing that finding a maximum clique in 
an IDNC graph is equivalent to finding an optimal solution 
to an Integer Quadratic Programming (IQP) problem. We 
then describe a reduction from a well known NP-Complete 
problem, the Exact Cover by 3-Sets problem, to the decision 
version of the IQP problem. 

1) Integer Quadratic Programming Formulation: Given a
side information matrix $A$ of size $n \times m$, we formulate the
IQP problem as follows. Let $r$ be a binary $n \times 1$ vector: $r_i \in \{0, 1\}$,
$i = 1, \dots, n$. Similarly, let $c$ be a binary $m \times 1$ vector:
$c_j \in \{0, 1\}$, $j = 1, \dots, m$. Below is the IQP problem for $A$:

Maximize: $V = r^T A c = \sum_{i=1}^{n} \sum_{j=1}^{m} r_i c_j a_{ij}$.

Subject to: $r_i \sum_{j=1}^{m} c_j a_{ij} \le 1, \quad \forall i = 1, \dots, n$. (1)
$r_i, c_j \in \{0, 1\}$. (2)
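
To make the formulation concrete, the sketch below solves the IQP by enumeration on a toy instance. It is an illustration under stated assumptions, not part of the paper's method: `iqp_brute_force` is a hypothetical name, and the matrix is our reading of Example 1. For a fixed $c$, the best feasible $r$ sets $r_i = 1$ exactly for the rows whose load $\sum_j c_j a_{ij}$ equals 1, so only the $2^m$ choices of $c$ need to be enumerated.

```python
from itertools import product

# Side information matrix of Example 1 as we read it (an assumption).
A = [
    [0, 0, 1, 1, 1, 1],
    [1, 1, 0, 1, 0, 1],
    [1, 1, 0, 1, 1, 0],
]


def iqp_brute_force(a):
    """Maximize V = r^T A c subject to r_i * sum_j c_j a_ij <= 1, r and c binary."""
    m = len(a[0])
    best_value, best_r, best_c = 0, None, None
    for c in product((0, 1), repeat=m):
        # Row load s_i = sum_j c_j a_ij for this choice of c.
        loads = [sum(cj * aij for cj, aij in zip(c, row)) for row in a]
        # Constraint (1) forces r_i = 0 when s_i >= 2; rows with s_i = 0 add nothing.
        r = tuple(1 if s == 1 else 0 for s in loads)
        value = sum(ri * s for ri, s in zip(r, loads))
        if value > best_value:
            best_value, best_r, best_c = value, r, c
    return best_value, best_r, best_c


value, r, c = iqp_brute_force(A)
print(value, r, c)
```

On this instance the optimal objective value matches the maximum clique size stated in the caption of Fig. 1.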



Proposition 4. Given a side information matrix $A$ and its
IDNC graph $G^A$, finding a maximum clique in $G^A$ is equivalent
to finding an optimal solution to the corresponding IQP.

We prove this proposition by establishing the following
Lemmas 5 and 6. The first lemma expresses the relationship
between the above IQP problem and the problem of finding a
maximum clique in $G^A$.

Lemma 5. Given a side information matrix $A$ and its IDNC
graph $G^A$, a clique in $G^A$ has a one-to-one correspondence
to a feasible solution of the IQP problem for $A$.

Proof:

($\Rightarrow$) We first show that a clique in the IDNC graph maps to
a feasible pair of vectors $r$ and $c$ of the IQP problem, which
is uniquely identified by the user and packet indices of the
vertices in the clique.

Let $C$ be a clique in $G^A$: $C = \{v_{i_1 j_1}, \dots, v_{i_k j_k}\}$. Let $\mathcal{I}$ be
the set of user indices: $\mathcal{I} = \{i_1, \dots, i_k\}$, and $\mathcal{J}$ be the set of
packet indices: $\mathcal{J} = \{j_1, \dots, j_k\}$. We create the feasible pair
of $r$ and $c$ as follows: Set $r_i = 1$ if $i \in \mathcal{I}$ and 0 otherwise,
and set $c_j = 1$ if $j \in \mathcal{J}$ and 0 otherwise.

To show that this pair of vectors is a feasible solution, we
proceed by showing that condition (1) of the IQP holds for
all user indices. Let $i$ be any user index, $i \in [1, n]$. It is clear
that (1) holds if $r_i = 0$. When $r_i = 1$, it suffices to show that
no two vertices of $C$ have the same user index. Indeed, this
follows from the observation that there is no edge between any
two vertices having the same user index (on the same row) in
the IDNC graph. Moreover, user $u_i$ cannot want a second packet
$p_{j'}$ with $j' \in \mathcal{J}$, $j' \neq j$, where $v_{ij} \in C$: the edge between $v_{ij}$
and the clique vertex in column $j'$ is of type (ii), which requires
$p_{j'} \in \mathcal{H}_i$.

($\Leftarrow$) Next, we show that a feasible solution of the IQP maps
to a uniquely identified clique in the IDNC graph. Let the pair
of vectors $r$ and $c$ be a feasible solution. We map this pair to
a clique $C$ in $G^A$ as follows: Initialize $C = \emptyset$. For $i \in [1, n]$,
for $j \in [1, m]$, if $r_i = c_j = a_{ij} = 1$, add vertex $v_{ij}$ to $C$.

Now pick any pair of vertices $v_{st}$ and $v_{pq}$ in $C$. It is clear
that if $t = q$, there is an edge between these two vertices. It
remains to show that when $t \neq q$, user $u_s$ has packet $p_q$ and
user $u_p$ has packet $p_t$. We will show that user $u_s$ must have
packet $p_q$. The other condition follows by symmetry. Assume
otherwise, i.e., user $u_s$ does not have packet $p_q$, which means
$a_{sq} = 1$. Since $C$ contains $v_{st}$ and $v_{pq}$, $r_s = c_t = a_{st} = 1$ and
$r_p = c_q = a_{pq} = 1$. But then for row $s$, condition (1) of the
IQP problem fails since

$$r_s \sum_{j=1}^{m} c_j a_{sj} \ge r_s c_t a_{st} + r_s c_q a_{sq} = 2.$$

Finally, one can readily check that for the above two
mappings, one is the reverse of the other. ■

Lemma 6. Given a side information matrix $A$ and its IDNC
graph $G^A$, the size of a clique in $G^A$ equals the objective
value $V$ of its corresponding feasible solution of the IQP
problem for $A$.

Proof: Let $C$ be a clique in $G^A$ and let $r$ and $c$ be the pair
of vectors of the corresponding feasible solution. For any user
index $i$ and packet index $j$, if $v_{ij} \in C$, then $r_i = c_j = a_{ij} = 1$.
Hence, every vertex in the clique adds 1 to $V$. ■

2) Reduction from Exact Cover by 3-Sets: Given a side
information matrix $A$, the decision version of the IQP problem
for $A$, denoted as D-IQP, asks the following question: "Is there
a feasible solution whose objective value equals $N$, for some
$N > 0$?"

Proposition 7. The D-IQP problem is NP-Complete. 

Proof: Clearly, D-IQP is in NP since given a feasible
pair of vectors $r$ and $c$, we can compute the objective value
in polynomial $O(nm)$ time.

In what follows, we show a reduction from the Exact Cover
by 3-Sets (X3C) problem to D-IQP. X3C is well known to be
an NP-Complete problem [32] and is defined as follows:

Definition 2. Given a set $\mathcal{E}$ of $3k$ elements: $\mathcal{E} = \{e_1, \dots, e_{3k}\}$,
and a collection $\mathcal{F} = \{S_1, \dots, S_\ell\}$ of subsets
$S_i \subset \mathcal{E}$ with $|S_i| = 3$, for $i \in [1, \ell]$, $\ell \ge k$. The X3C problem
asks the following question: "Are there $k$ sets in $\mathcal{F}$ whose
union is $\mathcal{E}$?"

The reduction: Given any instance of X3C, we create $3k$
users, $u_1, \dots, u_{3k}$, and $\ell$ packets, $p_1, \dots, p_\ell$. The users
correspond to the elements $e_i$, $i \in [1, 3k]$, and the packets
correspond to the sets $S_j$, $j \in [1, \ell]$. We form the side
information matrix $A^{X3C}$ corresponding to this X3C instance
by setting $a_{ij} = 1$ if $e_i \in S_j$ and 0 otherwise.

Next, we will show that there is a feasible solution to the
D-IQP for $A^{X3C}$ whose objective value $V$ equals $3k$ if and
only if there are $k$ sets $S_{j_1}, \dots, S_{j_k}$ whose union is $\mathcal{E}$.

($\Rightarrow$) Let $r$ and $c$ be the pair of vectors of the feasible solution
whose objective value $V = 3k$. First, observe that all $r_i$, for
$i = 1, \dots, 3k$, must equal 1; otherwise, assume there exists
some index $t \in [1, 3k]$ where $r_t = 0$, then

$$V = \sum_{i=1}^{3k} r_i \sum_{j=1}^{\ell} c_j a_{ij} = \sum_{i=1, i \neq t}^{3k} r_i \sum_{j=1}^{\ell} c_j a_{ij} < 3k,$$

since each term $r_i \sum_{j=1}^{\ell} c_j a_{ij}$ is at most 1 by constraint (1).
This is a contradiction.

Next, we create the corresponding solution to the X3C
problem using $c$. In particular, for $j = 1, \dots, \ell$, we select
$S_j$ if $c_j = 1$. Because $V = 3k$ and $r_i = 1$ for all $i$,

$$\sum_{j=1}^{\ell} c_j a_{ij} = 1, \quad \text{for } i = 1, \dots, 3k.$$

Thus, for a user index $s \in [1, 3k]$, there exists a unique packet
index $t \in [1, \ell]$, where $c_t a_{st} = 1$, which means $c_t = a_{st} = 1$.
By construction, we selected set $S_t$, and also this $S_t$ covers
element $e_s$ as $a_{st} = 1$. Therefore, each element is contained
in exactly one selected set. Since each selected set covers exactly
3 of the $3k$ elements, exactly $k$ sets are selected.

($\Leftarrow$) Let $S_{j_1}, \dots, S_{j_k}$ be the solution to the X3C problem.
We create the corresponding solution to the D-IQP problem
as follows. First, for $r$, let $r_i = 1$, for all $i = 1, \dots, 3k$. Then,
for $c$, let $\mathcal{J} = \{j_1, \dots, j_k\}$, and for $j = 1, \dots, \ell$, set $c_j = 1$
if $j \in \mathcal{J}$ and 0 otherwise. Since $S_{j_1}, \dots, S_{j_k}$ cover all $3k$
elements and each set has only 3 elements, each element $e_s$
appears in exactly one set $S_{j_t}$ for some $t \in [1, k]$, and $c_{j_t} = 1$.
Thus, for each element $s \in [1, 3k]$,

$$\sum_{j=1}^{\ell} c_j a_{sj} = c_{j_t} a_{s j_t} = 1 \cdot 1 = 1.$$

Given the above $r$ and $c$,

$$V = \sum_{i=1}^{3k} r_i \sum_{j=1}^{\ell} c_j a_{ij} = 3k \cdot 1 = 3k.$$

■

From Propositions 1, 4, and 7, we have the following main
result of this work.

Theorem 8. Given a side information matrix $A$ and its IDNC
graph $G^A$, finding a maximum clique in $G^A$, and equivalently,
an optimal instantly decodable packet, is NP-Hard. Their
corresponding decision versions are NP-Complete.
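
The reduction used in the proof of Proposition 7 can be exercised on small instances. The following Python sketch is illustrative only: `x3c_to_matrix`, `d_iqp`, and `has_exact_cover` are hypothetical helper names, the two toy instances are our own, and both decision procedures are brute force.

```python
from itertools import combinations, product


def x3c_to_matrix(num_elements, triples):
    """Build A^{X3C}: rows are elements (users), columns are 3-sets (packets)."""
    return [[1 if e in s else 0 for s in triples] for e in range(1, num_elements + 1)]


def d_iqp(a, target):
    """Decide whether some feasible (r, c) reaches objective value `target`."""
    m = len(a[0])
    for c in product((0, 1), repeat=m):
        loads = [sum(cj * aij for cj, aij in zip(c, row)) for row in a]
        # Rows with load exactly 1 can each contribute 1; any subset of them is feasible.
        if sum(1 for s in loads if s == 1) >= target:
            return True
    return False


def has_exact_cover(num_elements, triples):
    """Direct check: do some k = num_elements / 3 triples have union equal to the universe?"""
    k = num_elements // 3
    universe = set(range(1, num_elements + 1))
    return any(set().union(*combo) == universe for combo in combinations(triples, k))


yes_instance = [{1, 2, 3}, {4, 5, 6}, {1, 4, 5}]
no_instance = [{1, 2, 3}, {3, 4, 5}, {1, 5, 6}]
for triples in (yes_instance, no_instance):
    a = x3c_to_matrix(6, triples)
    print(d_iqp(a, 6), has_exact_cover(6, triples))
```

On both toy instances the D-IQP answer for target $3k = 6$ agrees with the direct exact-cover check, as the proof predicts.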

V. Maximum Cliques in Random IDNC Graphs 

In this section, we investigate Random Real-Time IDNC.
In particular, we assume that each user $u_i$, $i \in [1, n]$, fails
to receive a packet $p_j$, $j \in [1, m]$, with the same probability
$p \in (0, 1)$ independently. For ease of analysis, we assume
that $m$ is linear in $n$: $m = dn$, for some constant $d > 0$.
(Our results also hold when $m$ is polynomial in $n$.) A random
IDNC graph, denoted as $G^A(p)$, is the graph corresponding to
a side information matrix $A$ in which each entry equals 1 with
probability $p$ and 0 with probability $q = 1 - p$, independently.
Next, we analyze the size of the maximum clique, i.e., the
clique number, of random IDNC graphs.

The main results of this section are the following:

(i) For any $p \in (0, 1)$, the clique number for almost every
graph in $G^A(p)$ is linear in $n$. In particular, it equals
$j^* p q^{j^*-1} n$, where $j^* = \arg\max_j \, j p q^{j-1}$, $j^* \in \mathbb{N}$. With
high probability, the optimal recovery packet involves
combining $j^*$ packets.

(ii) With high probability, the maximum clique can be found
in polynomial time, $O(n m^{j^* + \delta})$, where $\delta$ is a small
constant parameter, and we provide an explicit algorithm
to find it. Consequently, the optimal recovery packet can
be computed in polynomial time in $n$.

Comparison to Erdős-Rényi Random Graphs: Clique numbers
of Erdős-Rényi random graphs with $n$ vertices and
$p = 1/2$ are known to be close to $2 \log_2 n$. However, it is
widely conjectured that for any constant $\epsilon > 0$, there does not
exist a polynomial-time algorithm for finding cliques of size
$(1 + \epsilon) \log_2 n$ with significant probability [33]. In contrast, for
random IDNC graphs with up to $n \times m$ vertices, where $m$ is linear
or polynomial in $n$, we show that the clique numbers are linear
in $n$, and corresponding cliques can be found in polynomial
time in $n$.

A. Clique Number of Random IDNC Graphs 

First, observe that any $k$ 1's that lie in the same column
form a clique of size $k$. Since the expected number of 1's
in a single column is $np$, the expected size of single-column
cliques is $np$. As a result, we expect the maximum clique size
to be linear in $n$.

Fix a set $\mathcal{C}_j$ of $j$ columns. A row $r$ is said to be good with
respect to $\mathcal{C}_j$ if among the $j$ columns, it has one 1 and $j - 1$
zeros. The probability that a row is good w.r.t. $\mathcal{C}_j$ is

$$f(j) = j p q^{j-1}. \qquad (1)$$

Let $Z_{\mathcal{C}_j}$ be the number of good rows w.r.t. $\mathcal{C}_j$. Then $Z_{\mathcal{C}_j}$ has
a binomial distribution: $\mathrm{Bin}(n, f(j))$.

Let $X_{\mathcal{C}_j}$ be the size of the maximum clique that has at least
one vertex on every column in $\mathcal{C}_j$, i.e., the clique touches $j$
columns. Observe that if $j = 1$, then $f(1) = p$, and $X_{\mathcal{C}_1} = Z_{\mathcal{C}_1}$,
which is the number of 1's in the chosen column. Thus,
$X_{\mathcal{C}_1}$ has a Binomial distribution: $\mathrm{Bin}(n, p)$. For $j > 1$, $X_{\mathcal{C}_j} \le Z_{\mathcal{C}_j}$
since the set of good rows may not have a 1 in every
column in $\mathcal{C}_j$. The following lemma states that for a large
$k$, where $k \overset{\text{def}}{=} Z_{\mathcal{C}_j} \sim \mathrm{Bin}(n, f(j))$, i.e., given large enough $n$,
$X_{\mathcal{C}_j} = Z_{\mathcal{C}_j}$ with high probability.
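
The quantities $Z_{\mathcal{C}_j}$ and $X_{\mathcal{C}_j}$ can be checked empirically with a short simulation. The sketch below is a hedged illustration: the function names are hypothetical, the parameters ($n = 2000$, $p = 0.3$, two columns) are arbitrary, and the rule used for $X_{\mathcal{C}_j}$ (all good rows if they hit every chosen column, else 0) is our reading of the definition above.

```python
import random


def f(j, p):
    """Probability that a row is good w.r.t. a fixed set of j columns."""
    return j * p * (1 - p) ** (j - 1)


def random_matrix(n, m, p, rng):
    """Side information matrix with i.i.d. entries equal to 1 with probability p."""
    return [[1 if rng.random() < p else 0 for _ in range(m)] for _ in range(n)]


def count_good_rows(a, columns):
    """Z: number of rows with exactly one 1 among the chosen columns."""
    return sum(1 for row in a if sum(row[c] for c in columns) == 1)


def clique_size_on_columns(a, columns):
    """X: the good rows form a clique touching every chosen column only if each column is hit."""
    good = [row for row in a if sum(row[c] for c in columns) == 1]
    hit = {c for row in good for c in columns if row[c] == 1}
    return len(good) if hit == set(columns) else 0


rng = random.Random(7)
a = random_matrix(2000, 5, 0.3, rng)
cols = (0, 1)
print(count_good_rows(a, cols), 2000 * f(2, 0.3), clique_size_on_columns(a, cols))
```

For these parameters the count of good rows should land near $n f(2) = 840$, and the clique size should coincide with it, in line with the lemma that follows.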

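Lemma 9. For a set of constant $j$ columns $\mathcal{C}_j$, there exists a
constant $k_j > 0$ such that for all $k > k_j$,

$$\Pr[Z_{\mathcal{C}_j} = X_{\mathcal{C}_j} \mid Z_{\mathcal{C}_j} = k] > 1 - j \left( \frac{j-1}{j} \right)^{k}.$$
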

Proof: For $k \ge j > 0$, let $B_j^k$ denote the number of ways
to put $k$ 1's into a matrix of size $k \times j$ such that

• each row has one 1, and

• each column has at least one 1.

Note that $B_1^k = 1$, and we have the following recurrence:

$$B_j^k = j^k - \sum_{i=1}^{j-1} \binom{j}{i} B_{j-i}^k. \qquad (2)$$

This recurrence states that the number of ways to put $k$ 1's
into $k$ rows (each row has one 1) using exactly $j$ columns
equals the number of ways to put $k$ 1's into $k$ rows without
any column restriction minus the cases where there are
$1, 2, \dots, j - 1$ empty columns. It can be shown by induction
that

$$B_j^k = \sum_{i=0}^{j-1} (-1)^i \binom{j}{i} (j - i)^k. \qquad (3)$$

In detail, assume that (3) is true for all indices $1, 2, \dots, j - 1$;
then, following from recurrence (2),

$$B_j^k = j^k - \sum_{i=1}^{j-1} \binom{j}{i} \sum_{l=0}^{j-i-1} (-1)^l \binom{j-i}{l} (j - i - l)^k.$$

Now, for any $t \in [1, j - 1]$, the coefficient of $(j - t)^k$ is

$$\sum_{i=1}^{t} \binom{j}{i} \binom{j-i}{t-i} (-1)^{t-i+1}.$$

Thus, it suffices to show that, for any $t \in [1, j - 1]$,

$$\sum_{i=1}^{t} \binom{j}{i} \binom{j-i}{t-i} (-1)^{t-i+1} = (-1)^t \binom{j}{t}.$$

The above equation holds iff

$$\sum_{i=1}^{t} \frac{(-1)^{t-i+1}}{i! \, (t-i)!} = \frac{(-1)^t}{t!}.$$

Or, equivalently,

$$\sum_{i=1}^{t} \binom{t}{i} (-1)^{t-i} = (-1)^{t+1}.$$

The LHS of the above equation equals

$$-(-1)^t + \sum_{i=0}^{t} \binom{t}{i} (-1)^{t-i} = (-1)^{t+1},$$

where the last "=" follows from the binomial theorem (for
$a = 1$, $b = -1$). This completes the induction to show (3).
From (3), we have that

$$B_j^k = j^k - j (j-1)^k + \sum_{i=2}^{j-1} (-1)^i \binom{j}{i} (j - i)^k. \qquad (4)$$

Let $k_j$ be the minimum positive integer value of $k$ such that
$\sum_{i=2}^{j-1} (-1)^i \binom{j}{i} (j - i)^k > 0$. Then, for all $k > k_j$,

$$\Pr[Z_{\mathcal{C}_j} = X_{\mathcal{C}_j} \mid Z_{\mathcal{C}_j} = k] = \frac{B_j^k}{j^k} > \frac{j^k - j (j-1)^k}{j^k} = 1 - j \left( \frac{j-1}{j} \right)^k.$$

■
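
The counting argument in this proof can be sanity-checked numerically. The sketch below is a hedged illustration with hypothetical function names: it evaluates the recurrence for $B_j^k$ (as we reconstruct it from the surrounding text, with $B_1^k = 1$) and the inclusion-exclusion closed form, and compares $B_j^k / j^k$ with the bound $1 - j((j-1)/j)^k$.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def b_recurrence(j, k):
    """B_j^k via the recurrence: all placements minus those leaving 1..j-1 columns empty."""
    if j == 1:
        return 1
    return j ** k - sum(comb(j, i) * b_recurrence(j - i, k) for i in range(1, j))


def b_closed_form(j, k):
    """B_j^k via inclusion-exclusion over the number of empty columns."""
    return sum((-1) ** i * comb(j, i) * (j - i) ** k for i in range(j))


def prob_all_columns_hit(j, k):
    """Probability that k good rows, each with its single 1 uniform over j columns, hit every column."""
    return b_closed_form(j, k) / j ** k


for j, k in [(2, 5), (3, 10), (4, 20)]:
    print(j, k, b_recurrence(j, k), b_closed_form(j, k),
          prob_all_columns_hit(j, k), 1 - j * ((j - 1) / j) ** k)
```

The two ways of computing $B_j^k$ agree on the sampled range, and the probability approaches 1 quickly as $k$ grows for fixed $j$.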



The following lemma states that $X_{\mathcal{C}_j}$, the size of the
maximum clique that touches all $j$ columns, heavily concentrates
around $nf(j)$ for large $n$. Intuitively, this follows from
$X_{\mathcal{C}_j} = Z_{\mathcal{C}_j}$ w.h.p. (Lemma 9), and the fact that the Binomial
distributed $Z_{\mathcal{C}_j}$, the number of good rows, concentrates heavily
around its mean, $nf(j)$.

Lemma 10. For a set of constant $j$ columns $\mathcal{C}_j$ and any
constant $c > 1$, let $\mu = nf(j)$ and $\delta = \sqrt{\frac{3c \ln n}{nf(j)}}$. For a
large $n$ such that $\mu - \mu\delta > k_j$ ($k_j$ is as in Lemma 9), we have

$$\Pr[\, |X_{\mathcal{C}_j} - \mu| > \mu\delta \,] < \frac{2}{n^c} + 2\mu\delta \, j \left(1 - \frac{1}{j}\right)^{\mu - \mu\delta}.$$

This probability goes to 0 as $n \to \infty$.

Proof: Denote $Z_{\mathcal{C}_j}$ and $X_{\mathcal{C}_j}$ by $Z$ and $X$, respectively.
Applying Chernoff's bound on the Binomial distributed variable
$Z$, we have

$$\Pr[\, |Z - \mu| > \mu\delta \,] < 2 \exp\left( -\frac{\mu\delta^2}{3} \right) = \frac{2}{n^c}.$$

Now,

$$\Pr[\, |X - \mu| > \mu\delta \,] = \sum_{k=0}^{n} \Pr[\, |X - \mu| > \mu\delta \mid Z = k \,] \cdot \Pr[Z = k]$$
$$\vdots$$
$$< \frac{2}{n^c} + (2\mu\delta - 2) \, j \left(1 - \frac{1}{j}\right)^{\mu - \mu\delta} \quad \text{(from Lemma 9)}.$$

Fig. 2. Plot of $f(j) = jp(1 - p)^{j-1}$ for different loss rates $p$ (horizontal axis: number of columns $j$).

Fig. 3. Values of $f(j^*)$ and its corresponding $j^*$. The clique number heavily
concentrates around $f(j^*) \times n$, and $j^*$ is the number of packets that should be
coded together.

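Note that $\mu\delta$ is $O(\sqrt{n \ln n})$; thus, $X_{\mathcal{C}_j}$ is within $\Theta(\sqrt{n \ln n})$
of $nf(j)$ w.h.p. Next, for a constant $j$, let $X_j$ be the size of the
maximum clique that touches any $j$ columns. $X_j$ also heavily
concentrates around $nf(j)$. Recall that $m = dn$. Formally,

Theorem 11. For a constant $j$ and any constant $c > j$, let $\mu = nf(j)$
and $\delta = \sqrt{\frac{3c \ln n}{nf(j)}}$. For a large $n$ such that $\mu - \mu\delta > k_j$
($k_j$ is as in Lemma 9), we have

$$\Pr[\, |X_j - \mu| > \mu\delta \,] < \frac{2d^j}{n^{c-j}} + 2 d^j n^j \mu\delta \, j \left(1 - \frac{1}{j}\right)^{\mu - \mu\delta}.$$

This probability goes to 0 as $n \to \infty$.

Proof: The proof is by using the union bound on the
result of Lemma 10:

$$\Pr[\, |X_j - \mu| > \mu\delta \,] = \Pr\Big[ \bigcup_{\mathcal{C}_j} |X_{\mathcal{C}_j} - \mu| > \mu\delta \Big]$$
$$\le \binom{m}{j} \Pr[\, |X_{\mathcal{C}_j} - \mu| > \mu\delta \,]$$
$$< m^j \left( \frac{2}{n^c} + 2\mu\delta \, j \left(1 - \frac{1}{j}\right)^{\mu - \mu\delta} \right)$$
$$= \frac{2d^j}{n^{c-j}} + 2 d^j n^j \mu\delta \, j \left(1 - \frac{1}{j}\right)^{\mu - \mu\delta}.$$

■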



We note that the above concentration result also holds when the number of packets, $m$, is polynomial in the number of users, $n$, i.e., $m = n^d$ for some constant $d > 0$. However, this requires a larger constant $c$ ($c > dj$), which means weaker concentration. Apparently, the results do not hold when $m$ is exponential in $n$. However, the cases where $m$ is either linear or polynomial in $n$ are sufficient for practical purposes as in
real-time applications, such as Q, m is often linear in n. 

Now let $j^* = \arg\max_{j \in \mathbb{N}} f(j)$. There may be a set of consecutive values of $j \in \mathbb{N}$ that maximize $f(j)$; in that case, pick $j^*$ to be the smallest one among them. For a constant $p$, $j^*$ and $f(j^*)$ are also constant.
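As a rough illustration, $j^*$ can be computed by a direct scan over $j$. This is a minimal sketch, assuming $f(j) = jp(1-p)^{j-1}$ as in the caption of Fig. 2; the function name `j_star`, the search cap `j_max`, and the floating-point tie tolerance are choices made for this example, not part of the original analysis.

```python
import math

def f(j, p):
    """f(j) = j * p * (1 - p)^(j - 1), as in the caption of Fig. 2."""
    return j * p * (1 - p) ** (j - 1)

def j_star(p, j_max=100_000):
    """Smallest j >= 1 that maximizes f(j), for a loss probability 0 < p < 1."""
    best_j, best_val = 1, f(1, p)
    for j in range(2, j_max + 1):
        val = f(j, p)
        if math.isclose(val, best_val, rel_tol=1e-12):
            continue  # numerical tie: keep the smaller j
        if val > best_val:
            best_j, best_val = j, val
        else:
            # f(j+1)/f(j) = (j+1)(1-p)/j shrinks as j grows,
            # so once f drops it never rises again.
            break
    return best_j

if __name__ == "__main__":
    for p in (0.1, 0.25, 0.5, 0.8):
        j = j_star(p)
        print(f"p={p}: j*={j}, f(j*)={f(j, p):.4f}")
```

The tolerance is there because two consecutive values of $j$ can tie exactly in theory (for example at $p = 0.1$) while differing in the last bits in floating point.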

Corollary 12. For a sufficiently large $n$, with high probability, the maximum clique touches a constant number $j^*$ of columns, where $j^* = \arg\max_j f(j)$.

Proof: Intuitively, this follows from the above result that the size of the maximum clique that touches $j$ columns concentrates heavily around $nf(j)$. In detail, consider any constant $j'$ such that $f(j') < f(j^*)$, and let $c > \max(j', j^*) + 1$. Theorem 11 implies that, with high probability, the size of the maximum clique that touches any $j'$ columns is at most

$$k' = nf(j') + \sqrt{3c f(j')\, n \ln n},$$

and the size of the maximum clique that touches any $j^*$ columns is at least

$$k'' = nf(j^*) - \sqrt{3c f(j^*)\, n \ln n}.$$

For a large enough $n$, it is clear that $k' < k''$, since the gap $n(f(j^*) - f(j'))$ grows linearly in $n$ while the two deviation terms grow only as $O(\sqrt{n \ln n})$. ■
Fig. 2 plots the function $f(j)$ for different values of $p$. This plot shows that (i) for $p \ge 0.5$, $f(j)$ is a decreasing function, and for $p < 0.5$, $f(j)$ initially increases and then decreases, and (ii) $j^*$ increases as $p$ decreases, which suggests that the number of packets that should be coded together increases as the loss rate decreases. Fig. 3 plots the values of $f(j^*)$ and the corresponding values of $j^*$. An important observation from Fig. 3 is that even when the loss rate is small, the clique size is still large. For instance, when $p = 0.1$, $j^* = 9$ and $f(j^*) \approx 0.38$, i.e., the optimal coded packet involves coding 9 plain packets together, and this packet will benefit about 38% of the users.
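The shape of $f(j)$ described here can also be seen from a short ratio argument. The derivation below is a sketch that uses only the definition $f(j) = jp(1-p)^{j-1}$ from the caption of Fig. 2, with $0 < p < 1$; the closed form for $j^*$ is a consequence of that definition rather than a statement taken from the original analysis.

```latex
\frac{f(j+1)}{f(j)}
  = \frac{(j+1)\,p\,(1-p)^{j}}{j\,p\,(1-p)^{j-1}}
  = \frac{(j+1)(1-p)}{j},
\qquad
\frac{f(j+1)}{f(j)} \ge 1 \iff j \le \frac{1-p}{p}.

% f is non-decreasing while j <= (1-p)/p and strictly decreasing afterwards,
% so the smallest maximizer is
j^* = \max\!\left(1, \left\lceil \frac{1-p}{p} \right\rceil\right).

% Example: p = 0.1 gives (1-p)/p = 9, hence f(9) = f(10) = 0.9^{9} \approx 0.387
% and the smallest maximizer is j^* = 9.
```

For $p \ge 0.5$ the threshold $(1-p)/p$ is at most 1, so $f$ never increases, and as $p$ decreases the threshold grows, which matches observations (i) and (ii).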

B. Finding a Maximum Clique 

Based on the analysis in the previous section, we propose Algorithm 1 to find a maximum clique of a given random IDNC graph. Algorithm 1 examines all cliques that touch $j$ columns, for all combinations of $j$ out of the $m$ columns, where $j$ is within a small constant $\delta$ neighborhood of $j^*$. In the case where $j^*$ is larger than $m$, $j^*$ is set equal to $m$ (Line 1), exploiting the fact that for $j < j^*$, $f(j)$ is an increasing function, as shown in Fig. 2.

Complexity. In Algorithm 1, the for-each loop starting at Line 3 runs at most $2\delta\binom{m}{j^*+\delta}$ times. The for loop starting at Line 5 runs $n$ times. The if-condition check at Line 6 examines up to $j^* + \delta$ entries. Thus, the total runtime of Algorithm 1 is at most $2\delta\binom{m}{j^*+\delta}\, n\,(j^*+\delta) = O(n m^{j^*+\delta})$, i.e., polynomial in $n$ when $m$ is linear or polynomial in $n$.
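To get a feel for the practical cost, one can count the column subsets that the outer loop visits for concrete parameters. The snippet below is only an illustrative sketch: `subsets_examined` is a hypothetical helper, and it assumes the range of $j$ is clipped to $[1, m]$.

```python
from math import comb

def subsets_examined(m, j_star, delta):
    """Number of column subsets visited when j ranges over
    [j_star - delta, j_star + delta], clipped to [1, m]."""
    j_star = min(m, j_star)          # cap j* at the number of packets
    lo = max(1, j_star - delta)
    hi = min(m, j_star + delta)
    return sum(comb(m, j) for j in range(lo, hi + 1))

if __name__ == "__main__":
    # Example: m = 20 packets, j* = 9 (p = 0.1), delta = 3
    print(subsets_examined(20, 9, 3))
```

Each visited subset then requires one pass over the $n$ rows, so the count above multiplied by $n$ gives the number of row checks.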

Given the vertices of the maximum clique output by Algorithm 1, one can readily compute the optimal instantly decodable packet by XORing the packets whose indices correspond to the packet indices of the output vertices, as indicated in Proposition 1.
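The XOR step can be sketched as follows. This is an illustration under stated assumptions: vertices are (user, packet index) pairs, packets are equal-length byte strings, the clique is non-empty, and the helper names `xor_coded_packet` and `decode` are hypothetical.

```python
def xor_bytes(a, b):
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def xor_coded_packet(vertices, packets):
    """XOR the distinct packets named by the clique vertices.

    vertices: non-empty list of (user, packet_index) pairs.
    packets:  list of equal-length byte strings.
    Returns the coded packet and the sorted packet indices it combines.
    """
    indices = sorted({c for _, c in vertices})
    coded = bytes(len(packets[indices[0]]))  # all-zero start value
    for c in indices:
        coded = xor_bytes(coded, packets[c])
    return coded, indices

def decode(coded, indices, wanted, packets):
    """A user holding every combined packet except `wanted` recovers it."""
    out = coded
    for c in indices:
        if c != wanted:
            out = xor_bytes(out, packets[c])
    return out

if __name__ == "__main__":
    packets = [b"\x01\x10", b"\x02\x20", b"\x04\x40"]
    coded, idx = xor_coded_packet([(0, 0), (1, 1), (2, 2)], packets)
    print(coded, decode(coded, idx, 1, packets) == packets[1])
```

Each user in the clique misses exactly one of the combined packets, so XORing out the packets it already holds leaves the missing one.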



Algorithm 1 Finding the Maximum Clique

Input: p: loss probability, n: number of users, m: number of packets, A: side information matrix of size n × m.
Output: I*: vertices of the maximum clique

1: j* ← min(m, argmax_{j ∈ N} f(j))
2: I* = ∅
3: for each combination of j columns out of m columns, where j ∈ [j* − δ, j* + δ]
4:     I = ∅
5:     for r = 1 → n do
6:         if row r has only one 1 among the selected columns, at column c, then
7:             Add (r, c) to I
8:         end if
9:     end for
10:    if |I| > |I*| then
11:        I* = I
12:    end if
13: end for
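A direct Python transcription of Algorithm 1 might look like the sketch below. It is not a reference implementation: it assumes `A[r][c] == 1` means that user `r` still wants (has lost) packet `c`, uses 0-based indices, takes $f(j) = jp(1-p)^{j-1}$ from the caption of Fig. 2, and computes Line 1 by scanning $j = 1, \dots, m$, which gives the same value as $\min(m, j^*)$ because $f$ rises up to $j^*$.

```python
from itertools import combinations
import math

def f(j, p):
    return j * p * (1 - p) ** (j - 1)

def capped_j_star(p, m):
    """Line 1: smallest maximizer of f over 1..m."""
    best_j = 1
    for j in range(2, m + 1):
        better = f(j, p) > f(best_j, p)
        tie = math.isclose(f(j, p), f(best_j, p), rel_tol=1e-12)
        if better and not tie:
            best_j = j
    return best_j

def max_clique(A, p, delta=3):
    """Return the largest clique found, as a list of (row, column) vertices."""
    n, m = len(A), len(A[0])
    j_star = capped_j_star(p, m)
    best = []                                           # Line 2
    lo, hi = max(1, j_star - delta), min(m, j_star + delta)
    for j in range(lo, hi + 1):                         # Line 3
        for cols in combinations(range(m), j):
            clique = []                                 # Line 4
            for r in range(n):                          # Line 5
                lost = [c for c in cols if A[r][c] == 1]
                if len(lost) == 1:                      # Line 6
                    clique.append((r, lost[0]))         # Line 7
            if len(clique) > len(best):                 # Line 10
                best = clique                           # Line 11
    return best

if __name__ == "__main__":
    A = [[1, 0, 0],
         [0, 1, 0],
         [0, 0, 1],
         [1, 1, 1]]
    print(max_clique(A, p=0.3, delta=1))
```

In this toy matrix the first three users each miss a different packet, so coding all three packets together benefits three users, while the fourth user, who misses all three, cannot decode.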



VI. Performance Evaluation 

In this section, we use simulation to compare the performance of the proposed Max Clique algorithm (Algorithm 1) against two baselines proposed in [6]: an optimal repetition-based algorithm, called Best Repetition, and a COPE-like greedy algorithm. The Best Repetition algorithm rebroadcasts the plain packet that is wanted by the largest number of users. This is inherently the best repetition strategy. The COPE-Like algorithm goes through all the packets that are still wanted by at least one user in a random order, and it tries to compute a coded packet that is instantly decodable to all users. In particular, it begins by selecting the first packet, $c = p_1$. It then goes through the rest of the packets one by one. At each step $j$, $j > 1$, it XORs the packet $p_j$ under consideration with $c$, i.e., $c = c \oplus p_j$, if the result is still instantly decodable to all users, and it skips $p_j$ otherwise. For reference, we also include the Random Repetition algorithm, which resends a random packet that is still wanted by at least one user.
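The baselines described above can be sketched in a few lines. The code is illustrative and rests on assumptions not spelled out in this section: `A[r][c] == 1` means user `r` still wants packet `c`, a coded packet is treated as instantly decodable to all users when no user has lost more than one of the combined packets, and a user benefits when it has lost exactly one of them. All function names are hypothetical.

```python
import random

def beneficiaries(A, cols):
    """Users that lost exactly one of the combined packets."""
    return sum(1 for row in A if sum(row[c] for c in cols) == 1)

def wanted_packets(A):
    """Packets still wanted by at least one user."""
    return [c for c in range(len(A[0])) if any(row[c] == 1 for row in A)]

def best_repetition(A):
    """Resend the plain packet wanted by the largest number of users."""
    m = len(A[0])
    return [max(range(m), key=lambda c: sum(row[c] for row in A))]

def random_repetition(A, rng):
    """Resend a random packet that is still wanted by at least one user."""
    wanted = wanted_packets(A)
    return [rng.choice(wanted)] if wanted else []

def cope_like(A, rng):
    """Greedily add packets while no user has lost two of the combined ones."""
    order = wanted_packets(A)
    rng.shuffle(order)
    coded = []
    for c in order:
        candidate = coded + [c]
        if all(sum(row[k] for k in candidate) <= 1 for row in A):
            coded = candidate
    return coded

if __name__ == "__main__":
    A = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1]]
    rng = random.Random(0)
    print(beneficiaries(A, best_repetition(A)))
    print(beneficiaries(A, cope_like(A, rng)))
```

In the toy matrix, the last user has lost every packet, so the greedy scheme can never combine two packets, which mirrors the behavior reported for high loss rates.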

Setting. For each loss rate from 1% to 99% in 1% increments, we randomly generate 100 side information matrices. We then run the algorithms on these matrices. For the Max Clique algorithm, we set $\delta$, the neighborhood around $j^*$, to 3. Fig. 4 plots the average number of beneficiary users as a function of the loss rate for the two parameter settings $\{n = 20, m = 20\}$ and $\{n = 40, m = 20\}$. For clarity, we do not plot the standard deviations: they range from 0 to 3 for all algorithms.
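The experimental loop can be approximated with the sketch below. It is a hedged reconstruction of the setting rather than the code behind the reported results: matrix entries are drawn i.i.d. with probability $p$ of being 1 (lost), any scheme is represented as a function from a matrix to its number of beneficiary users, and only a Best Repetition scorer is included as an example. The names `random_side_info`, `average_benefit`, and `sweep` are hypothetical.

```python
import random

def random_side_info(n, m, p, rng):
    """n x m matrix; entry 1 means the user lost (still wants) the packet."""
    return [[1 if rng.random() < p else 0 for _ in range(m)] for _ in range(n)]

def best_repetition_benefit(A):
    """Number of users served by the most-wanted plain packet."""
    return max(sum(row[c] for row in A) for c in range(len(A[0])))

def average_benefit(scheme, n, m, p, runs=100, seed=0):
    """Average number of beneficiary users over `runs` random matrices."""
    rng = random.Random(seed)
    total = sum(scheme(random_side_info(n, m, p, rng)) for _ in range(runs))
    return total / runs

def sweep(scheme, n, m, runs=100):
    """Loss rates 1%..99% in 1% steps, mapped to the average benefit."""
    return {pct: average_benefit(scheme, n, m, pct / 100, runs, seed=pct)
            for pct in range(1, 100)}

if __name__ == "__main__":
    curve = sweep(best_repetition_benefit, n=20, m=20)
    for pct in (10, 50, 90):
        print(pct, curve[pct])
```

Other schemes can be plugged in by passing a different scoring function to `sweep`.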

Results. In Fig. 4, we can see that the proposed Max Clique algorithm consistently and significantly outperforms all other algorithms. In particular, for the case $\{n = 20, m = 20\}$, on average, Max Clique performs 1.3 times better than both Best Repetition and COPE-Like. For loss rates between 40% and 50%, Max Clique performs up to 1.6 times better than the COPE-Like algorithm, and for loss rates between 10% and 15%, Max Clique performs up to 3.8 times better than the Best Repetition algorithm. A similar trend but with higher improvement, 1.35 times on average and up to 4.5 times, can be observed for the case $\{n = 40, m = 20\}$ in Fig. 4(b).

Fig. 4. Performance of the proposed Max Clique coding scheme in comparison with those of the Best Repetition and COPE-Like coding schemes. (a) n = 20 users, m = 20 packets. (b) n = 40 users, m = 20 packets.

When the loss rate is larger than a certain threshold (65% in Fig. 4(a)), the performance of Max Clique is similar to that of Best Repetition, which suggests that Max Clique also tries to select the best uncoded packet. This is because a plain packet now benefits many users due to the high loss rate. When the loss rate is larger than another threshold (50% in Fig. 4(a)), the performance of COPE-Like is similar to that of Random Repetition, which suggests that packets cannot be coded together while remaining instantly decodable to all users. This is because when the loss rate is high, given any pair of plain packets, there exists a user who lost both w.h.p.

VII. Conclusion 

In this paper, we formulate the Real-Time IDNC problem, 
which seeks to compute a recovery packet that is immediately 
beneficial to the maximum number of users. Our analysis 
shows that Real-Time IDNC is NP-Hard. This is shown by 
first mapping the problem to the problem of finding maximum 
cliques in IDNC graphs and then providing a reduction from 
the Exact Cover by 3-Sets problem. We then analyze the 
Random Real-Time IDNC, where each user is assumed to 
lose every packet with the same probability p independently. 
When the number of packets is linear or polynomial in the 
number of users, we show that the optimal packet can be 
computed in polynomial time in the number of users with high 
probability. We achieve this by providing an algorithm that is 
capable of finding a maximum clique in a random IDNC graph 
in polynomial time, which could be of independent interest. 
In the future, we plan to extend this work from a single time 
slot to a constant number of time slots, corresponding to delay 
tolerance greater than zero. 